Abstract

In laboratory visual search experiments, distractors are often statistically independent of
each other. However, stimuli in more naturalistic settings are often correlated and rarely
independent. Here, we examine whether human observers take stimulus correlations into
account in orientation target detection. We find that they do, although probably not optimally.
In particular, it seems that low distractor correlations are overestimated. Our results
might contribute to bridging the gap between artificial and natural visual search tasks.


Introduction 

Visual target detection in displays consisting of multiple simple stimuli is a mainstay in visual
science. Within this group of tasks, two classes can be distinguished: ones in which the distractors
are identical to each other (homogeneous), and ones in which they are not (heterogeneous)
[1]. Models have focused on homogeneous-distractor tasks, in which the value of the distractors
is fixed across trials [2-9]. For example, an observer might be detecting a vertically oriented
target among distractors that are always tilted 5° clockwise, or a signal among N image
patches that otherwise consist of only pixel noise. In such conditions, human performance is
well described by either a model in which the observer uses a maximum-of-outputs rule [9, 10]
or a Bayesian maximum-a-posteriori rule [4, 10]. In another type of homogeneous-distractor
task, the distractors are identical to each other but their value varies across trials [11]. A limitation
of studies using homogeneous distractor sets is that stimuli outside the laboratory are
often heterogeneous. For example, when detecting an animal hidden in the bushes, a friend in
a crowd, keys in a cluttered drawer, or a tumor on a CT scan, distractors typically vary in their
features both across space and across time. Modeling work on heterogeneous search, which
has not been as extensive as modeling of homogeneous search, has found that a Bayesian-observer
model provides a good description of human search for a fixed target among distractors
that are drawn independently from either a uniform [12, 13] or a normal distribution [11,
14] (although perhaps less so when the distribution is more complex [15]).

The assumption of independent distractors is probably not correct in more naturalistic settings,
where distracting elements will often have structure amongst themselves and therefore
be correlated. Here, we ask whether human observers take stimulus correlations into account
when detecting a target among distractors, and in particular whether they may assume correlations
where none exist, as has been observed in other contexts [16-19]. We find that human
observers take correlations into account, but indeed overestimate low correlations.


Experimental  Methods 

Task 

We conducted a target detection experiment in which observers were presented with four oriented
Gabor patches (Fig 1). The search target was a vertically oriented Gabor patch and was
present with probability 0.5 at a randomly chosen location. The task of the observers was to
report on each trial whether the target was present. We refer to orientations of patches that
were not the target as “distractors”. Distractor orientations were drawn from a multivariate
normal distribution. The marginal distribution of each distractor had a mean of 0° (vertical)
and a standard deviation of 15°. The amount of structure within a display was controlled by the
correlation coefficient, ρ, between distractor orientations. We used uniform correlations, which
means that ρ was the same for all distractor pairs (Fig 1b). In a given experimental session, ρ
took one of four values: 0 (independent distractors), 1/3, 2/3, or 1 (identical distractors).

Subjects 

Eleven  subjects  (6  male,  5  female)  participated  in  the  experiment.  All  subjects  had  normal  or 
corrected-to-normal  acuity  and  gave  written  informed  consent.  The  study  was  approved  by  the 
Institutional  Review  Board  of  the  Baylor  College  of  Medicine,  Houston,  Texas. 


Fig 1. Experimental procedure and sample displays. (a) Time course of a trial. (b) Sample displays (target absent and target present) for each of the
correlation coefficients used. In a given experimental session, only one value of ρ was used.

doi:10.1371/journal.pone.0149402.g001 


PLOS ONE | DOI:10.1371/journal.pone.0149402 March 10, 2016




Stimulus  Correlations  in  Visual  Search 


Apparatus  and  stimuli 

Stimuli were presented on a 21" LCD monitor with a refresh rate of 60 Hz. Subjects viewed the
displays from a distance of approximately 60 cm. The background luminance was 33.1 cd/m².
A set of 4 stimuli was shown on each trial. On target-present trials, the stimulus set consisted
of 1 target and 3 distractors, while on target-absent trials it consisted of 4 distractors. A target
was present on exactly half the trials. Each stimulus was a Gabor patch with a spatial frequency
of approximately 2.67 cycles/deg, a standard deviation of 0.26 deg, and a peak luminance of
136 cd/m² (which corresponds to a Michelson contrast of 0.61). Stimuli were placed on an
invisible circle centered at the fixation cross, with a radius of 3.2 degrees of visual angle. On
each trial, the first stimulus was placed at a random position along the circle, and the other stimuli
were placed so that the angular distance between two adjacent stimuli was always 45°. On
target-present trials, each location was equally likely to contain the target. The standard deviation
of the distractor distribution, σ_s, was fixed at 15°, while the correlation coefficient, ρ, was
varied across experimental sessions.


Procedure 

Each subject participated in four sessions. Each session lasted about 50 minutes and was run
on a different day or on the same day with an interval of at least an hour between consecutive
sessions. No more than two sessions were run on a single day for a subject. Within each session,
the correlation coefficient ρ was fixed at one of the values 0, 1/3, 2/3, or 1. The order of the sessions
was randomized across subjects. Each session consisted of one training block of 50 trials
and 6 testing blocks of 150 trials each. Each training trial began with the display of a fixation
cross at the center of the screen (500 ms), followed by the stimulus display containing 4 stimuli
(100 ms). After the stimuli were presented, only the fixation cross was displayed until the subject
responded (Fig 1a). Subjects reported through a key press whether the target was present
or absent. After each response, feedback was provided by coloring the fixation cross green (correct)
or red (incorrect) for 750 ms. During training, this was followed by a second presentation
of the stimuli for 2 s, with a blue circle identifying the target stimulus if one was present. Testing
trials were identical to training trials, except that feedback was provided only by changing
the color of the fixation cross; the stimuli were not redisplayed. A subject’s performance was
revealed after the completion of each block of 150 trials, along with the scores of the other subjects
who had completed the same session. Each subject completed a total of 3600 test trials. At
the beginning of the first session, we explained the trial procedure while demonstrating one
training trial step by step. After that, the subject completed 9 more practice trials in the presence
of the experimenter. At the end of the first session, we told the subject that in the next session,
the type of display would be slightly different from what they had experienced in the first
session. We never told subjects explicitly about correlations.


Experimental  Results 

Distractor correlation had a significant effect on the proportion of correct responses (repeated-measures
ANOVA: F(3,40) = 15.75, p < 0.0001; Fig 2a). This effect is still present when the hit
and false-alarm rates are analyzed separately (hit rate: F(3,40) = 5.57, p = 0.0027; false-alarm
rate: F(3,40) = 8.14, p = 0.0002; Fig 2b) and seems to be mostly driven by the performance
increase in the ρ = 1 condition. To visualize the subject data and model fits, we computed two
summary statistics, separately for each ρ condition: the proportion of “target present” responses
as a function of the standard deviation of the distractor set, and as a function of the minimum
difference between the orientation of the target and any distractor.
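The two summary statistics can be computed per trial roughly as follows. This is a minimal sketch, not the analysis code used in the study: the function names, the use of the sample standard deviation (`ddof=1`), and the rule that pools large values into the last bin are assumptions made for illustration. Orientations are in degrees and the target orientation is taken to be 0°.

```python
import numpy as np

def summary_stats(orientations, target_index=None):
    """Per-trial summary statistics, in degrees.

    orientations : 1-D sequence with the orientations shown on the trial.
    target_index : position of the target, or None on a target-absent trial.
    Returns (standard deviation of the distractors,
             minimum absolute difference between the target orientation, 0,
             and any distractor).
    """
    s = np.asarray(orientations, dtype=float)
    if target_index is None:
        distractors = s
    else:
        distractors = np.delete(s, target_index)
    sd = np.std(distractors, ddof=1)         # sample standard deviation
    min_diff = np.min(np.abs(distractors))   # target orientation is 0
    return sd, min_diff

def bin_index(value, bin_width, last_edge):
    """Bin number; values above last_edge are pooled into the final bin."""
    return int(min(value, last_edge) // bin_width)
```

With a bin width of 2.5° and an upper edge of 17.5°, `bin_index(sd, 2.5, 17.5)` places every trial whose standard deviation exceeds 17.5° in the final bin. In a homogeneous (ρ = 1) target-absent display all orientations coincide, so `summary_stats` returns a standard deviation of zero.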

Fig 2. Psychometric curves 1. (a) Proportion correct responses and (b) hit and false-alarm rates as a function of distractor correlation. Throughout the
paper, error bars indicate one standard error of the mean (s.e.m.).

doi:10.1371/journal.pone.0149402.g002


On average, the proportion of “target present” responses decreases as a function of the standard
deviation of the distractor set, except in the ρ = 1 condition (Fig 3a). Note that the number
of trials per bin differs both across bins and across correlation conditions (Fig 3b). In particular,
in the target-absent trials, the standard deviation of a distractor set in the homogeneous
condition (ρ = 1) is always zero.

Similarly, the proportion of “target present” responses generally decreases with the minimum
angle between the distractors and the (vertical) target orientation (Fig 4a). The differences
in the numbers of trials per bin (Fig 4b) produce a paradox: for example, in the target-absent
condition (Fig 4a, right), the entire ρ = 1 curve lies above the ρ = 0 curve, even though
subjects respond “target present” less often overall in the ρ = 1 condition (Fig 2b). This is an instance
of Simpson’s paradox [20]; it is resolved by realizing that the trials in the ρ = 0 condition are heavily
weighted towards bins corresponding to smaller values (Fig 4b, right).
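The reversal described above can be illustrated with a toy calculation. The counts below are invented purely for illustration and are not the experimental data; they only show how a condition can have the higher response rate in every bin and still the lower pooled rate when its trials are distributed differently over bins.

```python
# Hypothetical target-absent counts, invented for illustration (not experimental data).
# For each condition and bin: (number of trials, number of "target present" responses).
COUNTS = {
    "rho = 0": {"small min. difference": (800, 320), "large min. difference": (100, 5)},
    "rho = 1": {"small min. difference": (300, 135), "large min. difference": (600, 60)},
}

def per_bin_rates(condition):
    """Proportion of "target present" responses within each bin."""
    return {b: yes / n for b, (n, yes) in COUNTS[condition].items()}

def overall_rate(condition):
    """Proportion of "target present" responses pooled over all bins."""
    trials = sum(n for n, _ in COUNTS[condition].values())
    responses = sum(yes for _, yes in COUNTS[condition].values())
    return responses / trials

if __name__ == "__main__":
    for condition in COUNTS:
        print(condition, per_bin_rates(condition), round(overall_rate(condition), 3))
```

In this toy example the ρ = 1 rate exceeds the ρ = 0 rate in both bins, yet the pooled ρ = 1 rate is lower, because most ρ = 0 trials fall in the bin with small differences, where “target present” responses are most frequent.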


Models 

To determine whether and how subjects took correlations into account in this visual search
task, we fitted the optimal-observer model and several alternative models to the data. Here, we
first describe the generative model, which specifies how observations are statistically related
to the stimuli, in its most general form. We then derive the optimal decision rule. Finally, we
give an overview of the models that we fitted to the data. All models are variations of the
optimal-observer model.


Specification  of  the  generative  model 

The first step of Bayesian modeling is to define the task-relevant random variables and their
dependencies, collectively called the generative model. Although the number of stimuli, N, was
always 4 in our experiment, we present our model for general N. We denote target presence by
a binary variable T, with T = 0 denoting "target absent" and T = 1 denoting "target present".
The probability of target presence, p(T = 1), is equal to 0.5. When T = 1, a target location is chosen
with uniform probability. The target orientation is always vertical, which we define as 0°.
We denote the vector of stimulus orientations by s = (s_1, ..., s_N).

Fig 3. Psychometric curves 2. (a) Proportion “target present” responses and (b) number of trials as a function of the standard deviation of the distractor set,
for each correlation coefficient, averaged across subjects. Panels show target-present and target-absent trials; the horizontal axis is the sample standard
deviation of stimulus orientations (°). Bin size was 2.5°, except that all trials with sample standard deviation greater than 17.5° are collected in the last bin.
The plots in (b) are entirely determined by the stimuli, not by the subject responses; they serve to emphasize that the points in the plots in (a) were computed
on widely differing numbers of trials.

doi:10.1371/journal.pone.0149402.g003


On a target-absent trial, s is drawn from an N-dimensional multivariate normal distribution
with mean (0, ..., 0) and covariance Σ_s, which for N = 4 is

$$
\Sigma_s = \sigma_s^2
\begin{pmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}.
$$

Here, the correlation coefficient ρ is between 0 and 1. When ρ = 0, the orientations of all distractors
are chosen independently (maximal heterogeneity); when ρ = 1, they are identical
(homogeneous). This design interpolates between the homogeneous and heterogeneous conditions
in an earlier study [11].

On a target-present trial with the target at location j, the orientations of the N-1 distractors,
s_{\j} = (s_1, ..., s_{j-1}, s_{j+1}, ..., s_N), are drawn from an (N-1)-dimensional multivariate normal distribution
with mean 0_{N-1} = (0, ..., 0) and covariance Σ_{s\j}. The notation \j refers to the set of distractors
when the target is present at location j. The (N-1)×(N-1) covariance matrix, Σ_{s\j}, is obtained by
removing the jth row and the jth column of Σ_s, and we write

$$
p(\mathbf{s}_{\setminus j} \mid T = 1) = \mathcal{N}\left(\mathbf{s}_{\setminus j};\, \mathbf{0}_{N-1},\, \Sigma_{s \setminus j}\right).
$$

Fig 4. Psychometric curves 3. (a) Proportion “target present” responses and (b) number of trials as a function of minimum target-distractor orientation
difference (°) for target-present (left) and target-absent (right) trials. Bin size was 2°, except that all trials with minimum target-distractor orientation difference greater
than 10° are collected in the last bin. The plots in (b) are entirely determined by the stimuli, not by the subject responses; they help to reconcile the plots in (a)
with Fig 2b.

doi:10.1371/journal.pone.0149402.g004

We denote the observer's vector of stimulus measurements by x = (x_1, ..., x_N). We assume
that the stimulus measurements are corrupted by zero-mean Gaussian noise, so that, for the ith
location, we have

$$
p(x_i \mid s_i) = \mathcal{N}\left(x_i;\, s_i,\, \sigma_i^2\right).
$$

We further assume that measurement noise is independent between locations.
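The generative model can be simulated directly, which is a convenient way to check one's reading of it. The following is a minimal sketch under stated assumptions: the function names are illustrative, the measurement noise level `sigma_x = 5.0` is an arbitrary placeholder rather than a fitted value, and a single noise level is used for all locations.

```python
import numpy as np

def covariance(n, sigma_s, rho):
    """Covariance with equal variances sigma_s**2 and uniform correlation rho."""
    return sigma_s**2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

def sample_trial(rng, n=4, sigma_s=15.0, rho=0.5, sigma_x=5.0, p_present=0.5):
    """Draw (T, s, x) for one trial; orientations in degrees, target at 0."""
    target_present = rng.random() < p_present
    if target_present:
        j = rng.integers(n)                    # uniformly chosen target location
        others = np.delete(np.arange(n), j)
        s = np.zeros(n)                        # the target orientation stays at 0
        s[others] = rng.multivariate_normal(
            np.zeros(n - 1), covariance(n - 1, sigma_s, rho))
    else:
        s = rng.multivariate_normal(np.zeros(n), covariance(n, sigma_s, rho))
    # Independent zero-mean Gaussian measurement noise at every location.
    x = s + rng.normal(0.0, sigma_x, size=n)
    return int(target_present), s, x
```

A call such as `sample_trial(np.random.default_rng(0), rho=1.0)` yields distractors that are identical up to numerical precision, while `rho=0.0` yields independent ones.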


Optimal  decision  rule 

Optimal observers infer whether a target is present or not from the stimulus measurements, x,
by using their knowledge of the generative model. Specifically, an optimal observer computes
p(T = 1|x) and p(T = 0|x) and reports which possibility is more probable. This is equivalent to
computing the log posterior ratio,

$$
d = \log \frac{p(T = 1 \mid \mathbf{x})}{p(T = 0 \mid \mathbf{x})}
  = \log \frac{p(\mathbf{x} \mid T = 1)}{p(\mathbf{x} \mid T = 0)} + \log \frac{p(T = 1)}{p(T = 0)},
\tag{1}
$$
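and reporting "target present" if d > 0 and "target absent" otherwise. If the optimal observer
assumes equal probabilities for T = 0 and T = 1, then we find that d is given by (see
S1A Appendix)

$$
d = \log \frac{1}{N} \sum_{i=1}^{N}
\frac{\mathcal{N}\left(x_i;\, 0,\, \sigma_i^2\right)\,
      \mathcal{N}\left(\mathbf{x}_{\setminus i};\, \mathbf{0}_{N-1},\, \Sigma_{s \setminus i} + \Sigma_{x \setminus i}\right)}
     {\mathcal{N}\left(\mathbf{x};\, \mathbf{0}_{N},\, \Sigma_s + \Sigma_x\right)},
$$

where Σ_x = diag(σ_1², ..., σ_N²) is the covariance of the measurement noise and Σ_{x\i} is Σ_x
with the ith row and the ith column removed.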
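The log posterior ratio can also be evaluated numerically straight from the generative model, by averaging the likelihood ratio over the N equally likely target locations. The sketch below is one way to do this and is not taken from the authors' code; the function names and default values are assumptions. Under the model, the measurement of a target is distributed as N(0, σ_j²), the measurements of the distractors are zero-mean normal with the distractor covariance plus the noise covariance, and on target-absent trials all N measurements are zero-mean normal with covariance Σ_s plus the noise covariance.

```python
import numpy as np

def log_mvn_zero_mean(x, cov):
    """Log density of a zero-mean multivariate normal evaluated at x."""
    x = np.atleast_1d(x)
    cov = np.atleast_2d(cov)
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (x.size * np.log(2.0 * np.pi) + logdet + quad)

def decision_variable(x, sigma, sigma_s=15.0, rho=0.0, p_present=0.5):
    """Log posterior ratio d; the observer reports "target present" if d > 0.

    x       : measurements (degrees), one per location.
    sigma   : measurement noise standard deviations, one per location.
    rho     : correlation the observer assumes between distractors.
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = x.size
    cov_s = sigma_s**2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))
    cov = cov_s + np.diag(sigma**2)            # covariance of x when T = 0
    log_absent = log_mvn_zero_mean(x, cov)
    log_ratios = []
    for j in range(n):                         # putative target location
        keep = np.delete(np.arange(n), j)
        log_target = log_mvn_zero_mean(x[j], sigma[j] ** 2)
        log_rest = log_mvn_zero_mean(x[keep], cov[np.ix_(keep, keep)])
        log_ratios.append(log_target + log_rest - log_absent)
    log_lr = np.logaddexp.reduce(log_ratios) - np.log(n)
    return log_lr + np.log(p_present / (1.0 - p_present))
```

Working with log densities and `np.logaddexp` avoids numerical underflow when individual likelihoods are very small. The last line adds the log prior ratio from Eq 1, which vanishes for `p_present = 0.5`.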
Thus, the decision variable maps the stimulus measurements, x, and the variances of the
noise in each measurement, σ_1², ..., σ_N², to a real number. The dependence of the decision variable
on the measurements is complex and difficult to interpret in general. However, the cases
ρ = 0 and ρ = 1 are intuitive and tractable. When ρ = 0, distractor orientations are chosen independently,
and the decision variable is given by

$$
d = \log \frac{1}{N} \sum_{i=1}^{N} \sqrt{1 + \frac{\sigma_s^2}{\sigma_i^2}}\;
\exp\left(-\frac{x_i^2}{2}\, \frac{\sigma_s^2}{\sigma_i^2\left(\sigma_i^2 + \sigma_s^2\right)}\right).
$$


In this case, the optimal observer makes a decision based on a weighted average of all stimulus
measurements [11]. The weights are determined by the uncertainty of each measurement.
A measurement closer to 0 provides stronger evidence that a target is present. When ρ = 1, all
distractors are identical and the decision variable is given by

$$
d = \log \frac{1}{N} \sum_{j=1}^{N} \sqrt{1 + \frac{J_j}{J_s + \sum_{i \neq j} J_i}}\;
\exp\left(-\frac{1}{2}\left[\frac{\left(\sum_{i} J_i x_i\right)^2}{J_s + \sum_{i} J_i}
 - \frac{\left(\sum_{i \neq j} J_i x_i\right)^2}{J_s + \sum_{i \neq j} J_i}\right]\right),
$$

where J_i = 1/σ_i² and J_s = 1/σ_s² are the precisions of the measurements and of the distractor
distribution, respectively.

In this case, the optimal observer compares the squared weighted mean over all measurements
to the squared weighted mean over all observations excluding a putative target [11].
Roughly speaking, if the jth item is the target, the difference between these two quantities will
be more negative than if the target is absent, so the exponential term is higher, contributing to
the overall evidence for target presence.
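The two special cases can be checked numerically against the general Gaussian likelihood ratio. In the sketch below, the closed-form expressions for ρ = 0 and ρ = 1 were re-derived from the generative model for this illustration (using the determinant and inverse of a diagonal matrix plus a rank-one term), so they should be read as a consistency check under that assumption rather than as a transcription of the published formulas. All function names are illustrative; J = 1/σ² denotes precision.

```python
import numpy as np

def d_general(x, sigma, sigma_s, rho):
    """Log likelihood ratio for any rho, from zero-mean Gaussian densities."""
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = x.size
    cov = sigma_s**2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n))) + np.diag(sigma**2)

    def logpdf(v, k):
        _, logdet = np.linalg.slogdet(k)
        return -0.5 * (v.size * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(k, v))

    terms = []
    for j in range(n):
        rest = np.arange(n) != j               # mask selecting the distractors
        target = logpdf(x[[j]], np.array([[sigma[j] ** 2]]))
        others = logpdf(x[rest], cov[np.ix_(rest, rest)])
        terms.append(target + others - logpdf(x, cov))
    return np.logaddexp.reduce(terms) - np.log(n)

def d_independent(x, sigma, sigma_s):
    """Closed form for rho = 0: each location contributes separately."""
    x = np.asarray(x, dtype=float)
    J, Js = 1 / np.asarray(sigma, dtype=float) ** 2, 1 / sigma_s**2
    return np.log(np.mean(np.sqrt(1 + J / Js) * np.exp(-0.5 * x**2 * J**2 / (J + Js))))

def d_homogeneous(x, sigma, sigma_s):
    """Closed form for rho = 1: all distractors share one orientation."""
    x = np.asarray(x, dtype=float)
    J, Js = 1 / np.asarray(sigma, dtype=float) ** 2, 1 / sigma_s**2
    a_all, b_all = np.sum(J * x), Js + np.sum(J)
    a_wo, b_wo = a_all - J * x, b_all - J      # sums leaving out each putative target
    terms = np.sqrt(1 + J / b_wo) * np.exp(-0.5 * (a_all**2 / b_all - a_wo**2 / b_wo))
    return np.log(np.mean(terms))
```

For any measurement vector, `d_general(x, sigma, sigma_s, 0.0)` should agree with `d_independent(x, sigma, sigma_s)`, and `d_general(x, sigma, sigma_s, 1.0)` with `d_homogeneous(x, sigma, sigma_s)`, up to floating-point error.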

So far, we have assumed that the observer knows that the frequencies of target-present and
target-absent trials are equal, and incorporates this knowledge. We do not make this assumption
in the models that we fit to data. Instead, we allow for the possibility that the observer behaves
as if they believe that target-present trials occur with probability p(T = 1) = p_present. As Eq 1
shows, this prior probability will appear in the expression for d as an additive term
log[p_present / (1 - p_present)].

Model  overview 

The models that we fitted to the data have two factors: the observer's assumption about the distractor
correlations, and the presence/absence of variability in the precision of stimulus measurements.
We considered four possibilities for the first factor and two for the second, giving a
total of 8 models (Table 1).

Observer’s assumption about ρ. An optimal observer has complete knowledge of the generative
model, including the values of the correlations in all conditions, which we denote by a
vector ρ = (0, 1/3, 2/3, 1). There are other assumptions about ρ that an observer could be making,
leading to suboptimal performance. We consider the following four possible assumptions:

• ρ_assumed = ρ = (0, 1/3, 2/3, 1): the observer uses the correct values of distractor correlations (optimal).

• ρ_assumed = (0, 0, 0, 0): the observer assumes that orientations are drawn independently of each
other in all four conditions (which is optimal only in the first condition).

• ρ_assumed = (α, α, α, α): the observer assumes that the distractor correlation is the same in all
four conditions. The assumed value for this correlation, α, is a free parameter fit to the data.

• ρ_assumed = (α, β, γ, δ): the observer assumes a different value for the distractor correlation
in each experimental condition. The assumed correlations, α, β, γ, and δ, are free parameters.


Presence of variability in encoding precision. Recent studies have found evidence that
the level of measurement noise can vary across trials and across locations within a trial [11, 13,
21-25]. Therefore, we considered two types of models:

• Equal-precision (EP) models, in which measurements have the same precision (inverse variance)
across trials and stimuli. In this type of model, measurement precision J_i = J for all i.

• Variable-precision (VP) models, in which measurement precision is a random variable. In
line with previous work [13, 23], we assumed that each element in the precision vector
J = (J_1, ..., J_N) follows a Gamma distribution with mean J̄ and scale parameter τ. Note that J̄
and τ are hyperparameters in the VP models. Each value in the vector is sampled independently
across trials and stimuli.
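Variable precision can be simulated with NumPy's Gamma sampler. This is a minimal sketch: the function names are illustrative and the numerical values in the usage note are arbitrary placeholders, not fitted parameters. NumPy parameterizes the Gamma distribution by a shape k and a scale θ with mean kθ, so a distribution with a given mean and scale has shape equal to the mean divided by the scale.

```python
import numpy as np

def sample_precisions(rng, n_items, mean_precision, scale):
    """Draw one precision per item from a Gamma with the given mean and scale."""
    shape = mean_precision / scale             # mean = shape * scale
    return rng.gamma(shape, scale, size=n_items)

def noise_sd(precisions):
    """Convert precision (inverse variance) to a noise standard deviation."""
    return 1.0 / np.sqrt(precisions)
```

For example, `noise_sd(sample_precisions(np.random.default_rng(0), 4, 0.04, 0.02))` gives four noise standard deviations (in degrees) for one simulated trial. As the scale approaches zero with the mean held fixed, the distribution concentrates on its mean, which corresponds to the equal-precision case.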


Table 1. Summary of models. The models are organized according to two factors: the presence of variability in measurement precision (EP and VP), and
the observer's assumption about the correlation coefficients, ρ.

Precision | Model name | Observer’s assumption about ρ     | Number of free parameters
EP        | EP1        | ρ_assumed = ρ = (0, 1/3, 2/3, 1)  | 2 (p_present, J)
EP        | EP2        | ρ_assumed = (0, 0, 0, 0)          | 2 (p_present, J)
EP        | EP3        | ρ_assumed = (α, α, α, α)          | 3 (p_present, J, and α)
EP        | EP4        | ρ_assumed = (α, β, γ, δ)          | 6 (p_present, J, α, β, γ, and δ)
VP        | VP1        | ρ_assumed = ρ = (0, 1/3, 2/3, 1)  | 3 (p_present, J̄, and τ)
VP        | VP2        | ρ_assumed = (0, 0, 0, 0)          | 3 (p_present, J̄, and τ)
VP        | VP3        | ρ_assumed = (α, α, α, α)          | 4 (p_present, J̄, τ, and α)
VP        | VP4        | ρ_assumed = (α, β, γ, δ)          | 7 (p_present, J̄, τ, α, β, γ, and δ)

doi:10.1371/journal.pone.0149402.t001


Model Comparison Results

Our main question is whether and how humans take into account stimulus correlations in
visual search. A secondary question that we address is whether the current study supports the
evidence for variability in encoding precision that we found in previous work on visual search
[11, 13]. We first present results pertaining to the second question, because they turned out to
be more clear-cut.

We used maximum-likelihood estimation to fit our 8 models to subject data (see S1B
Appendix for details on the methods and S1 Table for parameter estimates). We compared
models using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion
(BIC) (see S1C Appendix). A parameter recovery analysis (see S1B Appendix) showed that in
our case, BIC recovers the correct model more reliably than AIC. We find that recovery is
good, but that correlations tend to be biased away from the extreme values 0 and 1.
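For reference, both criteria are simple functions of the maximized log likelihood and the number of free parameters. The sketch below uses the standard definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the function names are illustrative, and taking n to be the number of test trials per subject (3600) is an assumption about how the criteria would be applied here.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better model."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_trials):
    """Bayesian Information Criterion; lower values indicate a better model."""
    return n_params * math.log(n_trials) - 2 * log_likelihood
```

Because ln(3600) is larger than 2, BIC penalizes each additional parameter more heavily than AIC, so the four extra parameters of a model such as VP4 relative to VP1 cost more under BIC than under AIC.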


Equal  versus  variable  precision 

We  compared  the  fit  of  each  equal-precision  model  with  its  variable-precision  counterpart.  Fig  5 
shows  that  regardless  of  the  observer’s  assumption  about  correlations,  the  variable-precision 
models  better  fit  the  data.  This  agrees  with  previous  results  [11,13].  Therefore,  we  only  consider 
the  variable-precision  models  in  further  analyses. 


Do  subjects  take  stimulus  correlations  into  account? 

We next examine whether subjects take into account correlations between distractor orientations
when inferring the presence of a target. We found that the suboptimal model VP4 provided
the best fit to the data of each of the 11 subjects (Fig 6a). On average, the AIC value of the VP4
model was 50±13 lower than that of the optimal (VP1) model, which provides strong evidence
against the hypothesis that human subjects take stimulus correlations into account in an optimal
manner. Model VP4 also outperforms VP2 and VP3 (on average by 128±38 and 65±16,
respectively), indicating that subjects do not assume zero or identical correlations across conditions.
Hence, it seems that the subjects did take stimulus correlations into account in their decisions,
but in a way that deviated substantially from the optimal strategy.

The estimates of the observer's assumed values of the correlation coefficient in the VP4
model are shown in Fig 6b. While these estimates suggest that subjects overestimate low correlations
and underestimate high ones, they should be interpreted with caution, for
the following reasons:

• both the uncertainty in the parameter estimates within a subject and the variability across
subjects are large, partly due to limited data;

• if a model does not fit well (as is the case in the ρ = 1 condition; Fig 7), its parameters are not
meaningful;

• in synthetic data generated from the VP4 model, the correlation coefficient is also misestimated
(see S1B Appendix), with a similar (but weaker) trend as in Fig 6b.

Fig 5. AIC model comparison for equal versus variable precision. Shown are AIC differences of EP models relative to VP models for each subject (left)
and averaged over subjects (right). Higher AIC values mean worse fits. BIC results are consistent (S1a Fig).

Fig 6. AIC model comparison of VP models for observer’s assumption about ρ and parameter estimates of VP4 model ρ_assumed. (a) Shown are AIC
differences of VP models relative to VP4 (most general) model for each subject (left) and averaged across subjects (right). (b) ML estimates of ρ_assumed from
the VP4 model for each subject (colors) and averaged (black). BIC results are consistent (see S1b Fig).

Fig 7. Fits of the VP4 model to the summary statistics. (a) Proportion correct (top), hit, and false-alarm rates (bottom) as a function of distractor
correlation. Proportion “target present” responses as a function of (b) standard deviation of the distractor set, and (c) minimum target-distractor orientation
difference, averaged across subjects, separately for target present (black) and target absent (red) trials. Numbers indicate root-mean-square error (blue) and
R² statistics (green) between model and data.

Fig 8. Fits of the VP1 model to the summary statistics. For caption, see Fig 7.


We can therefore conclude that human observers take correlations into account in this target
detection task, but not optimally. Our models indicate that observers assume different
correlations under different conditions, but we cannot say precisely what correlations they do
assume.


Model  fits 

A model that wins in a model comparison does not necessarily fit the data well. To visualize
the performance of our best model, VP4, we show how it fits the psychometric curves from
Figs 2, 3 and 4 in Fig 7. For comparison, the fits of the optimal (VP1) model are shown in Fig 8.
Although the VP4 model provides an overall better fit, it also deviates from the data in apparently
systematic ways, especially in the homogeneous (ρ = 1) condition.


Post-hoc  models 

Given how poorly the models fit in the ρ = 1 condition, we examined a post-hoc model in
which mean precision, J̄, depends on the correlation condition; we call this the VP5 model.
Such a dependency might be justified if the items are not encoded independently, but as a
configuration [26]. Alternatively, differences in J̄ might reflect different degrees of suboptimality
in an earlier stage of inference [27]. In spite of these justifications, the VP5 model is ad hoc.

The VP5 model provides substantially better fits to summary statistics (Fig 9), particularly in
the ρ = 1 condition. The VP5 model outperforms all other VP models in AIC: VP1 by 108±25,
VP2 by 185±52, VP3 by 123±34, and VP4 by 58±22 (Fig 10a). Parameter estimates are shown
in Fig 10b. Mean precision, J̄, is estimated substantially higher in the ρ = 1 condition than in the
other conditions, suggesting that homogeneous displays are encoded in a fundamentally different
(more efficient) way than heterogeneous ones. Furthermore, ρ_assumed follows a similar relationship
as in the VP4 model (Fig 6b). Hence, our conclusion regarding how subjects take
correlations into account in this task does not strongly depend on the model that we fit. More
experiments, potentially with different values of ρ, larger set sizes, and more extensive training,
could shed more light on how exactly people misestimate stimulus correlations in visual search.

Fig 9. Fits of the VP5 model to the summary statistics. For caption, see Fig 7.

Discussion 

The  natural  world  is  full  of  correlations  between  stimuli.  Therefore,  to  understand  how  decisions 
are  made  in  natural  environments,  it  is  necessary  to  go  beyond  independent  stimuli  typically 
used  in  psychophysics  and  study  whether  and  how  observers  take  into  account  stimulus  correla¬ 
tions.  There  has  been  recent  interest  in  this  question.  In  contour  integration,  humans  seem  to  be 
taking  into  account  natural  co-occurrence  statistics  of  line  elements  [28].  In  change  detection, 
people  incorporate  knowledge  about  the  large-scale  statistical  structure  of  a  scene  [26, 29].  It  has 
been  proposed  that  overestimation  of  correlations  can  explain  set  size  effects  [30]  [31] . 
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Fig  1 0.  AIC  model  comparison  of  VP  models  relative  to  VP5  model  and  parameter  estimates  of  VP5  model,  (a)  Shown  are  AIC  differences  of  VP 
models  relative  to  VP5  model  for  each  subject  (left)  and  averaged  across  subjects  (right).  BIC  results  are  consistent  (Sic  Fig),  (b)  ML  estimates  of  J  and 
Passumed  from  the  VP5  model  for  each  subject  (colors)  and  averaged  with  standard  error  mean  across  subjects  (black). 
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Here,  we  tested  the  effect  of  introducing  a  nontrivial  statistical  structure  in  a  visual  search  task 
by  asking  subjects  to  detect  a  vertical  target  among  correlated  distractors.  Varying  the  correlation 
coefficient  of  the  distractors  allowed  us  to  compare  several  models  of  human  decision-making, 
all  variants  of  the  optimal-observer  model.  Within  this  set  of  models,  we  were  able  to  rule  out 
that  the  observer  used  the  correct  values  of  the  correlations  in  the  decision  process.  We  were  also 
able  to  rule  out  two  suboptimal-observer  assumptions  about  the  correlations:  that  stimuli  are 
uncorrelated,  or  that  the  correlations  are  constant.  We  found  that  the  best  model  was  the  most 
flexible  one,  in  which  the  assumed  values  of  the  correlations  could  differ  between  all  correlation 
conditions.  A  similar  conclusion  has  been  reached  in  a  study  of  human  subjects  in  a  reaching 
task  in  a  three-dimensional  virtual  reality  environment  [32].  In  that  study,  a  strong  correlation 
was  induced  between  two  dimensions  of  a  randomly  displaced  target,  which  an  optimal  observer 
would  learn  to  take  into  account.  Human  subjects  did  take  these  correlations  into  account,  but 
not  perfectly  and  to  a  degree  that  varied  considerably  between  subjects.  For  later  work,  it  should 
be  kept  in  mind  that  observers  might  not  have  properly  learned  the  joint  distractor  distribution 
in  our  experiment;  this  could  be  improved  through  explicit  instructions,  more  training  trials,  or 
using  more  than  four  stimuli  (so  that  observers  have  a  larger  sample  to  estimate  the  correlation). 
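
To make the stimulus structure concrete, the following sketch draws distractor orientations with a common pairwise correlation ρ by mixing one shared Gaussian component with independent ones. The Gaussian form, the 15° standard deviation, and all function names are illustrative assumptions rather than a description of our experimental code; only the set size of four is taken from the text.

```python
import math
import random

def sample_distractors(rho, n_items=4, sd_deg=15.0, rng=random):
    """Draw n_items orientations (deg from vertical) with pairwise correlation rho.

    Each orientation is sd_deg * (sqrt(rho) * shared + sqrt(1 - rho) * private),
    where shared and private are standard normal draws. This yields variance
    sd_deg**2 per item and correlation rho between any two items (0 <= rho <= 1).
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] for this construction")
    shared = rng.gauss(0.0, 1.0)
    return [
        sd_deg * (math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(n_items)
    ]

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(0)  # fixed seed so the check is repeatable

# Empirical correlation between the first two items over many simulated trials.
trials = [sample_distractors(0.5, rng=rng) for _ in range(20000)]
r = pearson([t[0] for t in trials], [t[1] for t in trials])
print(f"empirical correlation at rho = 0.5: {r:.3f}")

# With rho = 1 all distractors are identical (the homogeneous condition).
homogeneous = sample_distractors(1.0, rng=rng)
print(homogeneous)
```

With only four items per display, a single trial carries very little information about ρ, which illustrates why larger set sizes would give observers a better sample from which to learn the correlation.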

Our  best-fitting  models  (VP4  and  VP5)  suggest  that  humans  take  correlations  between  the 
distractors into account when inferring the presence of a target, but in a suboptimal manner.
In particular, people might be assuming that correlations are non-zero even when they are not.
Such an assumption of structure in a visual scene could be sensible in light of the prevalence of
structured scenes in nature. Similar overestimations of low correlations have been reported in
the temporal domain [16-19]. Hence, the suboptimality that we find in our laboratory
experiment may reflect an optimal adaptation to the natural world. We note, however, that the
assumed  correlations  seemed  to  vary  between  subjects,  and  were  difficult  to  estimate  precisely 
from the data (see S1 Appendix). Moreover, it does not seem that people overestimate a
correlation of zero by enough to account for set size effects in visual short-term memory, as was
recently proposed [30, 31].

Variable-precision  models  with  a  standard  encoding  stage  (VP4)  fitted  the  data  reasonably 
well, except in the ρ = 1 condition. To fit all conditions well, we had to construct an ad hoc
model (VP5) in which mean precision depends on correlation condition. In this model, mean
precision was estimated higher in the homogeneous (ρ = 1) condition; this might be due to a
texture  detection  or  other  gist  mechanism  that  we  do  not  explicitly  model.  The  difference  in 
mean precision between the homogeneous and the heterogeneous conditions seems, on the
surface, inconsistent with the result shown in Fig 9a of [11], where we did not find a difference.
However,  this  might  be  due  to  the  fact  that  in  that  paper,  we  assumed  the  optimal  model  (VP1) 
and  did  not  test  whether  subjects  correctly  assumed  zero  correlation. 

Of  course,  the  present  study  is  still  a  far  cry  from  studying  the  effect  of  stimulus  structure  on 
decision-making  in  natural  scenes,  for  several  reasons.  First,  the  set  size  used  in  our  experiment 
was  small  and  known  to  the  observer,  while  natural  visual  search  tasks  often  involve  a  large 
and  unknown  number  of  distractors.  Second,  our  subjects  were  instructed  to  maintain  fixation, 
which  rarely  happens  when  performing  visual  search  tasks  in  daily  life.  Third,  natural  search 
targets  are  often  defined  by  a  conjunction  of  features  (e.g.,  “find  the  red  car-shaped  object”). 
Future  work  will  have  to  address  how  well  our  results  generalize  to  tasks  with  larger  set  sizes, 
free viewing conditions, and conjunction targets. Finally, natural scene statistics are characterized
by complex, high-dimensional distributions, making simplified approaches difficult. In
particular, the stimuli that we use do not have the complexity of natural stimuli. In a naturalistic
model of simple shapes with occlusion, called the dead-leaves model, analytical expressions
have  been  derived  for  the  image  values  given  the  world  states  [33].  It  would  be  interesting  to 
examine  to  what  extent  human  observers  incorporate  such  statistics  in  their  decision-making. 


Supporting  Information 

S1 Appendix. Optimal decision rule, model fitting, and model comparison.

(DOCX) 

S1 Fig. Bayesian information criterion results paralleling the Akaike information criterion
results in the main text. Higher values mean that the model is worse. (a) Companion to Fig 5.
BIC differences between the EP models and their corresponding VP models for each subject
(left) and averaged over subjects (right). (b) Companion to Fig 6a. BIC differences between the
VP models and the VP4 (most general) model. VP4 outperforms VP1, VP2, and VP3 by
26±13, 103±38, and 47±16, respectively. (c) Companion to Fig 10a. BIC differences between the
VP models and the VP5 model for each subject (left) and averaged across subjects (right). The
VP5 model outperforms the VP1, VP2, VP3, and VP4 models by 66±25, 142±52, 86±34, and
39±22, respectively.

(TIF) 

S2 Fig. Model recovery analysis. Results of model comparisons obtained by comparing the fits
of the four VP models (rows) to data generated by each model (columns). The color and number
in a cell indicate a model's AIC (a) or BIC (b) value relative to the best-fitting model. A
value of zero on the diagonal indicates that the model used to generate the data was correctly
found to be the most likely model to have generated those data.

(TIF) 

S1 Table. Parameter estimates. Means and standard errors of the mean of the maximum-likelihood
estimates of all parameters in all models, as well as the tested ranges of the parameters.

(DOCX) 

S2 Table. Parameter recovery analysis for the VP4 model. Mean, standard error of the mean, and
95% confidence interval for the ρ_assumed estimates.

(DOCX) 